Vector Space Models of Lexical Meaning
Abstract
Much of this Handbook is based on ideas from Formal Semantics, in which the meanings of phrases or sentences are represented in terms of set-theoretic models. The key intuition behind Formal Semantics, very roughly, is that the world is full of objects; objects have properties; and relations hold between objects. Set-theoretic models are ideal for capturing this intuition, and have been successful at providing formal descriptions of key elements of natural language semantics, for example quantification. This approach has also proven attractive for Computational Semantics – the discipline concerned with representing, and reasoning with, the meanings of natural language utterances using a computer. One reason is that the formalisms used in the set-theoretic approaches, e.g. first-order predicate calculus, have well-defined inference mechanisms which can be implemented on a computer (Blackburn & Bos, 2005).

The approach to natural language semantics taken in this chapter will be rather different, and will use a different branch of mathematics from the set theory employed in most studies in Formal Semantics, namely the mathematical framework of vector spaces and linear algebra. The attraction of vector spaces is that they provide a natural mechanism for talking about distance and similarity, concepts from geometry. Why should a geometric approach to modelling natural language semantics be appropriate? There are many aspects of semantics, particularly lexical semantics, which require a notion of distance. For example, the meaning of the word cat is closer to the meaning of the word dog than to the meaning of the word car. The modelling of such distances is now commonplace in Computational Linguistics, since many examples of language technology benefit from knowing how word meanings are related geometrically; for example, a search engine could expand the range of web pages returned for a set of query terms by considering additional terms which are close in meaning to those in the query.

The meanings of words have largely been neglected in Formal Semantics, typically being represented as atomic entities such as dog′, whose interpretation is to denote some object (or set of objects) in a set-theoretic model. In this framework semantic relations among lexical items are encoded in meaning postulates, which are constraints on possible models. In this chapter the meanings of words will be represented using vectors, as part of a high-dimensional "semantic space". The fine-grained structure of this space is provided by considering the contexts …
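As a minimal, self-contained illustration of this geometric intuition (not taken from the chapter itself; the context features and counts below are invented), the following Python sketch builds toy context-count vectors for cat, dog and car and compares them with cosine similarity:

    import numpy as np

    # Toy context counts over the features (pet, fur, engine, wheels).
    # The numbers are invented purely to illustrate the geometry.
    vectors = {
        "cat": np.array([10.0, 8.0, 0.0, 1.0]),
        "dog": np.array([9.0, 7.0, 1.0, 1.0]),
        "car": np.array([0.0, 0.0, 9.0, 10.0]),
    }

    def cosine(u, v):
        # Cosine of the angle between u and v; 1.0 means same direction.
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    print(cosine(vectors["cat"], vectors["dog"]))  # high: cat is close to dog
    print(cosine(vectors["cat"], vectors["car"]))  # low: cat is far from car

With these counts, cat and dog point in nearly the same direction (cosine ≈ 0.99), while cat and car are almost orthogonal (cosine ≈ 0.06).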
Similar papers
Multi-Prototype Vector-Space Models of Word Meaning
Current vector-space models of lexical semantics create a single “prototype” vector to represent the meaning of a word. However, due to lexical ambiguity, encoding word meaning with a single vector is problematic. This paper presents a method that uses clustering to produce multiple “sense-specific” vectors for each word. This approach provides a context-dependent vector representation of word ...
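A hedged sketch of the general recipe (the paper's own pipeline differs in its features and clustering details): collect the context vectors in which an ambiguous word occurs, cluster them, and take each cluster centroid as a sense-specific prototype. The context vectors below are random stand-ins for real ones.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    # Stand-in for the contexts of an ambiguous word such as "bank":
    # two clouds of 20-dimensional context vectors, one per underlying sense.
    contexts = np.vstack([
        rng.normal(loc=0.0, scale=0.5, size=(50, 20)),  # e.g. "river" contexts
        rng.normal(loc=3.0, scale=0.5, size=(50, 20)),  # e.g. "finance" contexts
    ])

    # One prototype per discovered sense, rather than a single averaged vector.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(contexts)
    sense_prototypes = kmeans.cluster_centers_  # shape (2, 20)
    print(sense_prototypes.shape)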
Discovering Stylistic Variations in Distributional Vector Space Models via Lexical Paraphrases
Detecting and analyzing stylistic variation in language is relevant to diverse Natural Language Processing applications. In this work, we investigate whether salient dimensions of style variations are embedded in standard distributional vector spaces of word meaning. We hypothesize that distances between embeddings of lexical paraphrases can help isolate style from meaning variations and help i...
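One simple way to probe this hypothesis (a sketch under assumed embeddings, not the authors' method): compute embedding differences for paraphrase pairs that differ mainly in formality and check whether those difference vectors share a direction. The embeddings below are random stand-ins, so the printed score will stay near zero; with real embeddings a consistent style dimension would show up as correlated differences.

    import numpy as np

    rng = np.random.default_rng(1)

    # Assumed: a lookup from word to embedding; random vectors stand in
    # for real distributional embeddings here.
    embed = {w: rng.normal(size=50) for w in
             ["child", "kid", "automobile", "car", "purchase", "buy"]}

    # Paraphrase pairs ordered (formal, casual).
    pairs = [("child", "kid"), ("automobile", "car"), ("purchase", "buy")]
    diffs = np.array([embed[a] - embed[b] for a, b in pairs])

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    # If a formality dimension is embedded, the difference vectors should
    # correlate; with random stand-ins the score stays near zero.
    print(cosine(diffs[0], diffs[1]))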
Constructing Semantic Space Models from Parsed Corpora
Traditional vector-based models use word co-occurrence counts from large corpora to represent lexical meaning. In this paper we present a novel approach for constructing semantic spaces that takes syntactic relations into account. We introduce a formalisation for this class of models and evaluate their adequacy on two modelling tasks: semantic priming and automatic discrimination of lexical rel...
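The core idea can be sketched as follows (a toy reconstruction, not the paper's formalisation): instead of counting plain co-occurrences within a window, count co-occurrences keyed by the syntactic relation linking the two words, so that each dimension of the space is a (relation, word) pair. The dependency triples below are hand-written examples.

    from collections import Counter, defaultdict

    # Assumed input: (head, relation, dependent) triples from a parsed
    # corpus; these few are hand-written examples.
    triples = [
        ("eat", "dobj", "apple"),
        ("eat", "dobj", "bread"),
        ("red", "amod", "apple"),
        ("slice", "dobj", "bread"),
    ]

    # Each dimension is a (relation, word) pair, so "apple" as the object
    # of "eat" and "apple" modified by "red" are distinct features.
    space = defaultdict(Counter)
    for head, rel, dep in triples:
        space[head][(rel, dep)] += 1          # head's feature: relation + dependent
        space[dep][(rel + "-1", head)] += 1   # dependent's feature: inverse relation + head

    print(space["apple"])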
Bell States and Negative Sentences in the Distributed Model of Meaning
We use Bell states to provide compositional distributed meaning for negative sentences of English. The lexical meaning of each word of the sentence is a context vector obtained within the distributed model of meaning. The meaning of the sentence lives within the tensor space of the vector spaces of the words. Mathematically speaking, the meaning of a sentence is the image of a quantizing functo...
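The tensor-space setting can be illustrated in a few lines (a toy sketch, not the paper's categorical construction with Bell states): the meaning of a two-word utterance lives in the tensor product of its word spaces.

    import numpy as np

    # Toy context vectors for the words of a two-word utterance.
    dogs = np.array([0.9, 0.1, 0.3])
    sleep = np.array([0.2, 0.8, 0.1])

    # The composed meaning lives in the tensor product of the two word
    # spaces; np.kron flattens the 3x3 product space to 9 dimensions.
    sentence = np.kron(dogs, sleep)
    print(sentence.shape)  # (9,)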
In Defense of Spatial Models of Lexical Semantics
Semantic space models of lexical semantics learn vector representations for words by observing statistical redundancies in a text corpus. A word’s meaning is represented as a point in a high-dimensional semantic space. However, these spatial models have difficulty simulating human free association data due to the constraints placed upon them by metric axioms which appear to be violated in assoc...
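The contrast at issue can be made concrete (an illustrative sketch; the association strengths are invented, not real norms): any distance defined on a vector space is symmetric by construction, whereas human free association is directional.

    import numpy as np

    def cosine_distance(u, v):
        # Symmetric by construction: d(u, v) == d(v, u) always holds.
        return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    u = np.array([1.0, 2.0, 0.5])
    v = np.array([0.3, 1.0, 2.0])
    assert cosine_distance(u, v) == cosine_distance(v, u)

    # Free association, by contrast, is directional; these strengths are
    # invented for illustration, not taken from real association norms.
    association = {("nurse", "doctor"): 0.6, ("doctor", "nurse"): 0.2}
    print(association[("nurse", "doctor")] == association[("doctor", "nurse")])  # False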
Measuring Word Relatedness Using Heterogeneous Vector Space Models
Noticing that different information sources often provide complementary coverage of word sense and meaning, we propose a simple and yet effective strategy for measuring lexical semantics. Our model consists of a committee of vector space models built on a text corpus, Web search results and thesauruses, and measures the semantic word relatedness using the averaged cosine similarity scores. Desp...
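A minimal sketch of the committee idea (with assumed models and words; the real system builds its spaces from a text corpus, web search results and thesauruses): score a word pair in each vector space model separately, then average the cosine similarities.

    import numpy as np

    rng = np.random.default_rng(2)

    # Assumed: three embeddings of the same vocabulary built from different
    # sources (corpus, web search results, thesaurus); random stand-ins here.
    models = [{w: rng.normal(size=30) for w in ["tea", "coffee"]}
              for _ in range(3)]

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def relatedness(w1, w2):
        # Average the cosine scores across the committee of models.
        return float(np.mean([cosine(m[w1], m[w2]) for m in models]))

    print(relatedness("tea", "coffee"))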